[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3518#issuecomment-70394250 Hey @ilganeli - I took a slightly deeper look this time. I still don't totally follow how this all hooks together, but I wonder if it's possible to write a single

[GitHub] spark pull request: [SPARK-5208][DOC] Add more documentation to Ne...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4012#issuecomment-70395228 @sarutak when we added the netty shuffle we actually decided not to expose these in order to keep the overall # of configurations manageable. We couldn't think

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70395195 @ScrapCodes mind bringing up to date? The current form LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub

[GitHub] spark pull request: [SPARK-5289]: Backport publishing of repl, yar...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4079#issuecomment-70390243 @vanzin I tried to cover it in #4080 - but basically there were changes you made that were anyways being requested by others in the community (asking us to publish

[GitHub] spark pull request: SPARK-5270 [CORE] Elegantly check if RDD is em...

2015-01-17 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4074#discussion_r23129615 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -436,6 +436,12 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T

[GitHub] spark pull request: [SPARK-3880] HBase as data source to SparkSQL

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4084#issuecomment-70390848 Also one thing that would help is if you could create a standalone project for this on github (see spark-avro).

[GitHub] spark pull request: [SPARK-5096] Use sbt tasks instead of vals to ...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3905#issuecomment-70391899 I think this is good for now, I can pull it in.

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3518#issuecomment-70399358 Hey so it looks like while I was reviewing this patch @rxin actually ran into this and just wrote a fix himself (#4093). That fix is actually even simpler than what I

[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-70391071 Actually @JoshRosen - can you take a look at this one?

[GitHub] spark pull request: [SPARK-5289]: Backport publishing of repl, yar...

2015-01-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4079#issuecomment-70391965 Okay I'm pulling these two in. I think we're all on the same page.

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-17 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23130489 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -44,42 +44,62 @@ class TaskMetrics extends Serializable

[GitHub] spark pull request: SPARK-5199. Input metrics should show up for I...

2015-01-17 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4050#discussion_r23131557 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -219,6 +220,9 @@ class HadoopRDD[K, V]( val bytesReadCallback

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70220172 @ScrapCodes Rather than change this to +LinkedHashMap+ can you just check if it contains it before removing it? It might not be obvious to developers that +remove+ has

[GitHub] spark pull request: [SPARK-5214][Core] Add EventLoop and change DA...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4016#issuecomment-70220476 I think if we have a stated goal of moving away from akka (which would eventually move all of our event loops into this structure) then it makes some sense to do

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70220952 ah I see - if the existing remove call is safe, then I think it's fine.

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23070055 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala --- @@ -37,12 +37,18 @@ private[ui] class AllStagesPage(parent: StagesTab

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23070066 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala --- @@ -37,12 +37,18 @@ private[ui] class AllStagesPage(parent: StagesTab

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23070082 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -17,7 +17,7 @@ package org.apache.spark.ui.jobs

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70229214 What about combining the input size and records in the same column? Overall this will help with the expansion in the number of columns. The title could be Input Size

[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70229681 Jenkins ok to test. Jenkins, test this please.

[GitHub] spark pull request: SPARK-3996: Shade Jetty in Spark deliverables.

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3130#issuecomment-70231376 Hey All, I don't see where spark-submit fits into the issue of having a conflicting jetty library? If you have an application that requires a conflicting

[GitHub] spark pull request: [SPARK-733] Add documentation on use of accumu...

2015-01-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4022#discussion_r23072648 --- Diff: docs/programming-guide.md --- @@ -1316,7 +1316,35 @@ For accumulator updates performed inside <b>actions only</b>, Spark guarantees t will only

[GitHub] spark pull request: [SPARK-733] Add documentation on use of accumu...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4022#issuecomment-70299057 Okay new version LGTM! Jenkins, test this please.

[GitHub] spark pull request: SPARK-3996: Shade Jetty in Spark deliverables.

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3130#issuecomment-70299385 One case I suppose where it could be relevant is if we only shaded Jetty inside of our assembly jar. That would only help users of spark-submit (also, it would be much

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-70300365 @squito yeah for sure, I think it's nice to have some set of integration tests that are flagged as such. I was just suggesting other ways we can improve our tests

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4051#discussion_r23099965 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala --- @@ -73,12 +73,12 @@ private[spark] class ClientArguments(args: Array

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4051#issuecomment-70300840 It seems fine to me to just change the defaults. This is an experimental feature in 1.2 and people will just be kicking the tires. If we get some user feedback

[GitHub] spark pull request: SPARK-5270 [CORE] Elegantly check if RDD is em...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4074#issuecomment-70310222 Seems reasonable to have since it's non-obvious how to do it - @srowen could you add this in Java and Python?

[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70320813 Thanks Chip - I will pull this in. After some more thought on this, I'm just going to pull this into master for 1.3+ and for 1.2 we'll just publish the original REPL

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/4078 [HOTFIX]: Minor clean up regarding skipped artifacts in build files. There are two relevant 'skip' configurations in the build, the first is for mvn install and the second is for mvn deploy

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4078#issuecomment-70332121 @vanzin or @srowen - mind taking a quick look?

[GitHub] spark pull request: [SPARK-5289]: Backport publishing of repl, yar...

2015-01-16 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/4079 [SPARK-5289]: Backport publishing of repl, yarn into branch-1.2. This change was done in SPARK-4048 as part of a larger refactoring, but we need to backport this publishing of yarn and repl

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4078#issuecomment-70334225 I think it is likely still used by companies that publish Spark via the traditional route, either internally or for forks of Spark. For the upstream release publishing

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell closed the pull request at: https://github.com/apache/spark/pull/4078

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/4080 [HOTFIX]: Minor clean up regarding skipped artifacts in build files. There are two relevant 'skip' configurations in the build, the first is for mvn install and the second is for mvn deploy

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4080#issuecomment-70334878 @srowen my comment on the old one was that, since companies probably still use the traditional deploy pattern internally (e.g. they aren't doing anything fancy to cross

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4080#issuecomment-70334937 Jenkins, test this please.

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4080#issuecomment-70339584 @vanzin yeah I think it was fine that you removed the install. We wanted to do that anyways because some people requested that we continue to publish those artifacts

[GitHub] spark pull request: [HOTFIX]: Minor clean up regarding skipped art...

2015-01-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4080#issuecomment-70339633 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3518#discussion_r23050901 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -827,9 +868,21 @@ class DAGScheduler( // might modify state

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3518#discussion_r23051146 --- Diff: core/src/main/scala/org/apache/spark/util/ObjectWalker.scala --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3518#discussion_r23050776 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -789,6 +792,44 @@ class DAGScheduler

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3518#issuecomment-70179454 Hey just took a quick pass with some code style suggestions (more coming) and usability suggestions. One thing, would it be possible to track the name of the fields you

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3518#discussion_r23051221 --- Diff: core/src/main/scala/org/apache/spark/util/ObjectWalker.scala --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3518#discussion_r23051282 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -459,7 +459,23 @@ private[spark] class TaskSetManager

[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3518#discussion_r23051021 --- Diff: core/src/main/scala/org/apache/spark/util/SerializationHelper.scala --- @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-2630 Input data size of CoalescedRDD cou...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2310#issuecomment-70212481 @ash211 I think we can close this issue now that we have merged #3120

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-70212728 @squito another thing is that we should look at whether these tests really need to be integration style tests or not. I've seen people often use `local-cluster` because

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70214406 I'm not sure this can be merged as-is. The state clean-up here is based on the assumption that every stage that is pending will at some later time be submitted

[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70214681 Jenkins, test this please.

[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70219039 Ah I see - so on the first point, the issue may be covered by the code you referenced. For some reason the diff originally rendered in a way where I didn't notice

[GitHub] spark pull request: [Minor] Fix tiny typo in BlockManager

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4046#issuecomment-70192913 Thanks guys - I merged it in.

[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4034#discussion_r23059280 --- Diff: repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkCommandLine.scala --- @@ -23,10 +23,15 @@ import scala.Predef._ /** * Command

[GitHub] spark pull request: [SPARK-5249] Added type specific set functions...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4042#issuecomment-70197264 Yes, we could just add new ones. For this feature though, I guess I don't see why users can't just call `toString` explicitly.

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-70192150 LGTM pending tests

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-70192111 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23060808 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -257,8 +257,8 @@ private[spark] class Executor( val serviceTime

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70201497 Conflicts abound!

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70192435 Btw - this may conflict with #3120 so maybe we should hold off until that's merged.

[GitHub] spark pull request: SPARK-4746 make it easy to skip IntegrationTes...

2015-01-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4048#issuecomment-69991447 Just curious - what is the before and after time? I.e. what fraction of time does this cut down on?

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22777883 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22777939 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala --- @@ -344,12 +354,20 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69538284 LGTM - just had a minor comment that can also be addressed on merge. Jenkins, test this please.

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22778368 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -153,34 +157,19 @@ class NewHadoopRDD[K, V]( throw new

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22778546 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22778563 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -109,18 +109,22 @@ class NewHadoopRDD[K, V]( logInfo(Input split

[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3645#issuecomment-69635091 Yeah we've also seen this issue in docker environments. There is an alternative solution we just merged that allows overriding the reverse DNS lookup - and in our

[GitHub] spark pull request: [WebUI] Fix collapse of WebUI layout

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3995#issuecomment-69624287 @sarutak can you give a screenshot of what it looks like before your patch (or after?). I can't tell whether the one here is before or after.

[GitHub] spark pull request: [SPARK-5078] Optionally read from SPARK_LOCAL_...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3893#issuecomment-69634618 Okay let's pull this in for now. Don't want to block this on `SPARK-5113`.

[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3645#issuecomment-69635199 BTW - I also created this JIRA to try and clean up the way we deal with binding and advertised hostnames in Spark: https://issues.apache.org/jira/browse/SPARK

[GitHub] spark pull request: [SPARK-5061][SQL] SQLContext: overload createP...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3957#issuecomment-69625129 Let's close this issue to keep the queue clean, pending any new discussion.

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22814472 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -153,34 +157,19 @@ class NewHadoopRDD[K, V]( throw new

[GitHub] spark pull request: Added --package argument to make-distributio...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3682#issuecomment-69637083 Hey @jmschrack - I'm not sure we want to allow packaging without building in this script. This allows for some confusing outcomes, such as a user compiles Spark for one

[GitHub] spark pull request: SPARK-5172 [BUILD] spark-examples-***.jar shad...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3992#issuecomment-69637599 Thanks all, I'll pull this in.

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-12 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69623494 Actually, why not add these in the DAG Scheduler per @nitin2goyal's suggestion.

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22811962 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -213,6 +216,7 @@ class

[GitHub] spark pull request: [SPARK-5200] Disable web UI in Hive ThriftServ...

2015-01-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3998#issuecomment-69534258 LGTM

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-69535751 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...

2015-01-10 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-69479694 Okay great - I'm merging this.

[GitHub] spark pull request: [SPARK-4014] Add TaskContext.attemptNumber and...

2015-01-10 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3849#issuecomment-69486253 @joshrosen LGTM regarding the renaming.

[GitHub] spark pull request: [SPARK-5073] spark.storage.memoryMapThreshold ...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3900#issuecomment-69369806 Can you update the configuration page as well? `docs/configuration.md`: http://spark.apache.org/docs/latest/configuration.html

[GitHub] spark pull request: SPARK-5136 [DOCS] Improve documentation around...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3952#issuecomment-69368687 Looks great, thanks Sena.

[GitHub] spark pull request: [SPARK-5163] [CORE] Load properties from confi...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3963#issuecomment-69370660 I'd prefer not to accept this patch for now - the `spark-defaults.conf` concept was isolated to Spark submit intentionally to keep things simple. It's easy for users

[GitHub] spark pull request: HOTFIX: Minor improvements to make-distributio...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3973#issuecomment-69369351 Okay, pulling this into master. These are just minor clean-ups to this internal script.

[GitHub] spark pull request: [SPARK-5138][SQL] Ensure schema can be inferre...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3978#issuecomment-69369627 /cc @davies

[GitHub] spark pull request: [SPARK-1143] Separate pool tests into their ow...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3967#issuecomment-69370309 Looks good, I'll pull it in.

[GitHub] spark pull request: [SPARK-5163] [CORE] Load properties from confi...

2015-01-09 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3963#issuecomment-69370673 close this issue.

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2015-01-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2982#issuecomment-69279421 Thanks @vanzin for this cleanup - I'll merge this.

[GitHub] spark pull request: SPARK-5136 [DOCS] Improve documentation around...

2015-01-08 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3952#discussion_r22703171 --- Diff: docs/building-spark.md --- @@ -153,7 +153,8 @@ Thus, the full flow for running continuous-compilation of the `core` submodule m # Using

[GitHub] spark pull request: HOTFIX: Minor improvements to make-distributio...

2015-01-08 Thread pwendell
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/3973 HOTFIX: Minor improvements to make-distribution.sh 1. Renames $FWDIR to $SPARK_HOME (vast majority of diff). 2. Uses Spark-provided Maven. 3. Logs build flags in the RELEASE file. You can

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2015-01-08 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r22691185 --- Diff: pom.xml --- @@ -980,15 +1142,17 @@ <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2015-01-08 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r2269 --- Diff: yarn/pom.xml --- @@ -131,13 +131,6 @@ <skip>true</skip> </configuration> </plugin> - <plugin>

[GitHub] spark pull request: [SPARK-4048] Enhance and extend hadoop-provide...

2015-01-08 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2982#discussion_r22672500 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -990,11 +990,19 @@ private[spark] object Utils extends Logging { for ((key

[GitHub] spark pull request: [SPARK-5032] [graphx] Remove GraphX MIMA exclu...

2015-01-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3856#issuecomment-69245017 LGTM

[GitHub] spark pull request: SPARK-4687. [WIP] Add an addDirectory API

2015-01-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3670#issuecomment-69246607 I looked around a bit to try and find Java libraries that provide a recursive copy option. It turns out most of them don't - both Guava and Java 7's new file utilities

[GitHub] spark pull request: [SPARK-4387][PySpark] Refactoring python profi...

2015-01-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3255#issuecomment-69247079 @udnay this does not merge cleanly. Can you rebase it?

[GitHub] spark pull request: SPARK-4660: Use correct default classloader in...

2015-01-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3840#issuecomment-69247743 @pkolaczk can you make a version of this pull request against the master branch?

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-08 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22679008 --- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala --- @@ -160,7 +160,7 @@ class EventLoggingListenerSuite extends

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-08 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22679665 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala --- @@ -344,12 +354,20 @@ private[spark] class

[GitHub] spark pull request: SPARK-4687. [WIP] Add an addDirectory API

2015-01-08 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3670#issuecomment-69246069 Yea my thoughts for `recursive` were to match the semantics of `cp` which is the main file copying interface I interact with. Calling `cp` on a directory without
