[GitHub] spark pull request: [SPARK-14358] Change SparkListener from a trai...

2016-04-03 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/12142#issuecomment-205149044 Just that minor question but otherwise LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-14358] Change SparkListener from a trai...

2016-04-03 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/12142#discussion_r58330267 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -249,6 +250,8 @@ trait SparkListener { * Called when other

[GitHub] spark pull request: [WIP]Remove streaming-flume, streaming-mqtt, s...

2016-03-12 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/11672#issuecomment-195850156 I would also like to see the Kafka modules removed in a similar way. We have had trouble balancing Spark's compatibility requirements and Kafka's every

[GitHub] spark pull request: [SPARK-10088] [sql] Add support for "stored as...

2015-08-18 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/8282#issuecomment-132320867 Classic copypasta. LTGM. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-6325] [core,yarn] Do not change target ...

2015-03-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/5018#issuecomment-81949164 I'm not very familiar with the code but it looks like a pretty sane change from what I can tell. --- If your project is set up for it, you can reply to this

[GitHub] spark pull request: [SPARK-6325] [core,yarn] Do not change target ...

2015-03-16 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/5018#discussion_r26529056 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -340,7 +341,11 @@ class

[GitHub] spark pull request: [SPARK-6183][Deploy] Skip bad workers when re-...

2015-03-06 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4909#discussion_r25964901 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -467,7 +467,9 @@ private[spark] class Master( * two executors on

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-03-04 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3486#discussion_r25843057 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala --- @@ -362,11 +362,13 @@ private[spark] class Worker( self

[GitHub] spark pull request: [SPARK-5982] Remove incorrect Local Read Time ...

2015-02-24 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4749#issuecomment-75872068 This LGTM. Curious, have you thought about how to properly capture this metric? I haven't given this any serious thought but my first idea is that need to

[GitHub] spark pull request: [SPARK-5227] [SPARK-5679] Disable FileSystem c...

2015-02-13 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4599#issuecomment-74352451 Ok, this LTGM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: [SPARK-5227] [SPARK-5679] Disable FileSystem c...

2015-02-13 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4599#discussion_r24707286 --- Diff: core/src/test/scala/org/apache/spark/input/WholeTextFileRecordReaderSuite.scala --- @@ -42,7 +42,15 @@ class WholeTextFileRecordReaderSuite

[GitHub] spark pull request: [SPARK-5735] Replace uses of EasyMock with Moc...

2015-02-12 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4578#issuecomment-74169618 I was meaning to do this at some point. Thanks Josh. LGTM! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as

[GitHub] spark pull request: [SPARK-5762] Fix shuffle write time for sort-b...

2015-02-12 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4559#issuecomment-74130013 LGTM, @sryza I think its okay to include the reading time since this whole operation you can argue is for writing out the file - even if there is some reading

[GitHub] spark pull request: [SPARK-5762] Fix shuffle write time for sort-b...

2015-02-12 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4559#discussion_r24608462 --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala --- @@ -740,6 +741,8 @@ private[spark] class ExternalSorter[K, V, C

[GitHub] spark pull request: [SPARK-5645] Added local read bytes/time to ta...

2015-02-10 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4510#issuecomment-73812744 Thanks for doing this! Thats one less thing for me to do. Overall this LGTM aside from the small little comments I made. --- If your project is set up for it, you

[GitHub] spark pull request: [SPARK-5645] Added local read bytes/time to ta...

2015-02-10 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4510#discussion_r24464381 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -344,6 +346,20 @@ class ShuffleReadMetrics extends Serializable

[GitHub] spark pull request: [SPARK-5645] Added local read bytes/time to ta...

2015-02-10 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4510#discussion_r24464185 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -181,7 +188,8 @@ private[ui] class StagePage(parent: StagesTab) extends

[GitHub] spark pull request: [SPARK-5645] Added local read bytes/time to ta...

2015-02-10 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4510#discussion_r24450007 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -189,6 +189,7 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-08 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-73429206 Yikes, @JoshRosen i'm looking into this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request: [SPARK-5347][CORE] Change FileSplit to InputSp...

2015-02-06 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4150#issuecomment-73275084 I agree with @srowen and @sryza. Also given #4067 this metric should really just report size --- If your project is set up for it, you can reply to this email and

[GitHub] spark pull request: [SPARK-5653][YARN] In ApplicationMaster rename...

2015-02-06 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4430#issuecomment-73273181 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request: [SPARK-5636] Ramp up faster in dynamic allocat...

2015-02-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4409#issuecomment-73187911 one small comment thing but other than that LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request: [SPARK-5636] Ramp up faster in dynamic allocat...

2015-02-05 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4409#discussion_r24224468 --- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala --- @@ -78,7 +78,7 @@ private[spark] class ExecutorAllocationManager

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r24223601 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -358,5 +374,12 @@ class ShuffleWriteMetrics extends Serializable

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r24223586 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala --- @@ -193,12 +194,11 @@ private[spark] class DiskBlockObjectWriter

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-73182058 Jenkins, retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-73164133 New screenshots with the irrelevant columns invisible: ![screen shot 2015-02-05 at 5 07 23 pm](https://cloud.githubusercontent.com/assets/6590087/6072770

[GitHub] spark pull request: SPARK-5557: Explicitly include servlet API in ...

2015-02-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4411#issuecomment-73162789 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r24207220 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -472,12 +512,12 @@ private[ui] class StagePage(parent: StagesTab) extends

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r24206958 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala --- @@ -193,12 +194,11 @@ private[spark] class DiskBlockObjectWriter

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-05 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r24205171 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -1089,17 +1091,15 @@ class PairRDDFunctions[K, V](self: RDD[(K, V

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-02-04 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3486#discussion_r24131025 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -382,7 +382,8 @@ private[spark] object JsonProtocol { def

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-02-04 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-72946588 @JoshRosen Thanks for doing that. I merged the pr --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23953114 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -472,12 +512,12 @@ private[ui] class StagePage(parent: StagesTab) extends

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23952709 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -47,9 +49,13 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23950220 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -47,9 +49,13 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-72498987 @pwendell I'm not sure how we can do what you propose without having an O(n) loop through all the records before passing the InterruptableIterator? We cou

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-02 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23940117 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -47,9 +49,13 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-30 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-72292815 @pwendell can you please re-review this? I'd like to get it in to 1.3. Some of our customers have been asking for metrics to help them determine data skew.

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-30 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-72292701 New Screenshot that correspond to CR feedback. ![screen shot 2015-01-30 at 7 07 48 pm](https://cloud.githubusercontent.com/assets/6590087/5985688/871415e4-a8b3

[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI

2015-01-27 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-71739273 @andrewor14 @JoshRosen Ping. Can you guys review this. I'd like to get it in Spark 1.3 --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: SPARK-5393. Flood of util.RackResolver log mes...

2015-01-27 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4192#discussion_r23629409 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -60,6 +62,9 @@ private[yarn] class YarnAllocator( import

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-26 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4051#issuecomment-71558537 LGTM I agree with @sryza that maxExecutors should default to Integer.MAX_VALUE. It is not the app's responsibility to play fair but rather to ask the res

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-26 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4051#discussion_r23571627 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala --- @@ -75,14 +75,23 @@ private[spark] class ClientArguments(args: Array

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-26 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4051#discussion_r23571472 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala --- @@ -75,14 +75,23 @@ private[spark] class ClientArguments(args: Array

[GitHub] spark pull request: SPARK-5393. Flood of util.RackResolver log mes...

2015-01-26 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4192#discussion_r23569025 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala --- @@ -60,6 +62,9 @@ private[yarn] class YarnAllocator( import

[GitHub] spark pull request: [SPARK-5291][CORE] Add timestamp and reason wh...

2015-01-23 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4082#issuecomment-71282577 LGTM - nice addition. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-5190] Allow SparkListeners to be regist...

2015-01-21 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4111#discussion_r23334796 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -379,6 +379,41 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-5190] Allow SparkListeners to be regist...

2015-01-21 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4111#discussion_r2878 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -379,6 +379,41 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-5190] Allow SparkListeners to be regist...

2015-01-21 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4111#discussion_r2304 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -379,6 +379,41 @@ class SparkContext(config: SparkConf) extends Logging with

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-20 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4020#issuecomment-70740442 @pwendell yeah, looks good. Sorry I was also away. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark pull request: [SPARK-5190] Allow SparkListeners to be regist...

2015-01-20 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4111#issuecomment-70737041 @pwendell I think using configuration to add spark listeners adds complexity to the system. 1) Can't do static checking of listeners at compile time 2

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70301522 A big motivation to add recordsRead/Written was to detect data skew. In these cases bytes and records might not track very closely. Thinking more about this, I

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4051#issuecomment-70297739 I'd prefer we don't introduce yet another config option for dynamic scaling. How much delay are we really introducing here? The number of running executors w

[GitHub] spark pull request: SPARK-5270 [CORE] Elegantly check if RDD is em...

2015-01-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4074#issuecomment-70296530 LTGM. What is the use case? is this part of a bigger pr? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70230969 If we do that we wouldn't be able to sort on num records and bytes independently. --- If your project is set up for it, you can reply to this email and have

[GitHub] spark pull request: SPARK-2450 Adds exeuctor log links to Web UI

2015-01-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-70221523 @JoshRosen Now that #3696, then #3711 are merged, I rebased, fixed the merge issues and so I think this pr is ready once again to be reviewed. --- If your project

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70220867 ![screen shot 2015-01-16 at 12 04 04 am](https://cloud.githubusercontent.com/assets/6590087/5773199/bcd70ffc-9d13-11e4-86c1-7140a73f9e92.png) Shows a stage that has

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70219531 So is this code you were referring to in HadoopRDD? ```scala // Find a function that will return the FileSystem bytes read by this thread. Do this before

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70218640 @rxin I updated the PR after doing a rebase and also incorporated some of your feedback. You made two general comments: 1) specific implementations might've g

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23066295 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -213,18 +213,19 @@ class HadoopRDD[K, V]( logInfo("Input

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r23066065 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -146,6 +185,10 @@ class TaskMetrics extends Serializable

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70215626 This change was dependent on https://github.com/apache/spark/pull/3120, that just got merged and now there are some merge conflicts. I need to fix those first and will

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23065399 --- Diff: core/src/main/scala/org/apache/spark/util/interceptingIterator.scala --- @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4067#discussion_r23065373 --- Diff: core/src/main/scala/org/apache/spark/shuffle/hash/BlockStoreShuffleFetcher.scala --- @@ -82,7 +82,16 @@ private[hash] object

[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-01-15 Thread ksakellis
GitHub user ksakellis opened a pull request: https://github.com/apache/spark/pull/4067 [SPARK-4874] [CORE] Collect record count metrics Collects record counts for both Input/Output and Shuffle Metrics. For the input/output metrics, it just appends the counter every time the

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23058543 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable

[GitHub] spark pull request: [SPARK-3288] All fields in TaskMetrics should ...

2015-01-15 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4020#discussion_r23056809 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable

[GitHub] spark pull request: SPARK-5199. Input metrics should show up for I...

2015-01-14 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4050#discussion_r22985033 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -219,6 +220,9 @@ class HadoopRDD[K, V]( val bytesReadCallback = if

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-14 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4051#discussion_r22972821 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala --- @@ -73,12 +73,12 @@ private[spark] class ClientArguments(args: Array

[GitHub] spark pull request: SPARK-4585. Spark dynamic executor allocation ...

2015-01-14 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4051#discussion_r22972319 --- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala --- @@ -73,12 +73,12 @@ private[spark] class ClientArguments(args: Array

[GitHub] spark pull request: SPARK-5199. Input metrics should show up for I...

2015-01-14 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4050#issuecomment-69996188 This mostly LGTM. My only concern is with the proliferation of copy pasta between the HadoopRDD and NewHadoopRDD. --- If your project is set up for it, you can reply

[GitHub] spark pull request: SPARK-5199. Input metrics should show up for I...

2015-01-14 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/4050#discussion_r22971793 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -219,6 +220,9 @@ class HadoopRDD[K, V]( val bytesReadCallback = if

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-13 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69804537 @pwendell @nitin2goyal I had previously thought of adding it to the DAGScheduler - this would avoid the duplicated code between the CoarseGrainedSchedulerBackend and

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22822258 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-12 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22807972 --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala --- @@ -153,34 +157,19 @@ class NewHadoopRDD[K, V]( throw new

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-08 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22681441 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala --- @@ -344,12 +354,20 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-08 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22680915 --- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala --- @@ -160,7 +160,7 @@ class EventLoggingListenerSuite extends

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-07 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22624775 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -84,6 +85,14 @@ case class SparkListenerBlockManagerRemoved(time: Long

[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...

2015-01-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3696#issuecomment-68813105 @JoshRosen yup, moved lazy val env = prepareEnvironment to be last statement so it is the same as before. --- If your project is set up for it, you can reply to this

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-68788875 Yes it is. Not sure why I changed the #of cores between the two commits in the unit test - weird. Anyways. it has been fixed. --- If your project is set up for it

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2015-01-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-68778832 @kayousterhout @pwendell ping? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2015-01-05 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-68778757 @JoshRosen I actually couldn't make ExecutorInfo (now ExecutorDetails) a trait because the json protocol stuff requires a concrete object to deserialize events

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2014-12-30 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22366788 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -84,6 +86,14 @@ case class SparkListenerBlockManagerRemoved(time: Long

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2014-12-30 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22366699 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorInfo.scala --- @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2014-12-22 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22195578 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorInfo.scala --- @@ -0,0 +1,29 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...

2014-12-16 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-67213648 @kayousterhout The test you pointed out: sc.parallelize(1 to 2).saveAsTextFile("file:tester1") val a = sc.textFile("file:tester1")

[GitHub] spark pull request: [SPARK-4079] [CORE] Consolidates Errors if a C...

2014-12-16 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3119#discussion_r21922602 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -120,6 +129,12 @@ class LZFCompressionCodec(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...

2014-12-16 Thread ksakellis
GitHub user ksakellis opened a pull request: https://github.com/apache/spark/pull/3711 [SPARK-4857] [CORE] Adds Executor membership events to SparkListener Adds onExecutorAdded and onExecutorRemoved events to the SparkListener. This will allow a client to get notified when an

[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...

2014-12-14 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3696#issuecomment-66958311 Hmm.. tests failed but I'm not sure they are related to this change. Am I missing something? --- If your project is set up for it, you can reply to this emai

[GitHub] spark pull request: SPARK-4843 [YARN] Squash ExecutorRunnableUtil ...

2014-12-14 Thread ksakellis
GitHub user ksakellis opened a pull request: https://github.com/apache/spark/pull/3696 SPARK-4843 [YARN] Squash ExecutorRunnableUtil and ExecutorRunnable ExecutorRunnableUtil is a parent of ExecutorRunnable because of the yarn-alpha and yarn-stable split. Now that yarn-alpha is

[GitHub] spark pull request: [SPARK-4754] Refactor SparkContext into Execut...

2014-12-11 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3614#issuecomment-66696814 Just curious here, but is the eventual goal to pull this functionality(adding/removing executors) out of the SparkContext object? --- If your project is set up for

[GitHub] spark pull request: [WIP] SPARK-2450 Adds exeuctor log links to We...

2014-12-09 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-66376138 @andrewor14 The reason for this approach is that it is really not yarn specific. I was envisioning adding support for standalone mode this way too. This is why the

[GitHub] spark pull request: [SPARK-4765] Make GC time always shown in UI.

2014-12-09 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3622#issuecomment-66360558 @kayousterhout I ran into this issue with a private case class too on: https://github.com/apache/spark/pull/3486 and added exclusions. I can try adding private[spark

[GitHub] spark pull request: [WIP] SPARK-2450 Adds exeuctor log links to We...

2014-12-09 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-66356776 @andrewor14 Yes you can start doing a review. I wanted to post this just to get sign off on the strategy i'm using. If people feel good about this, i'll post

[GitHub] spark pull request: [SPARK-4774] Makes HiveFromSpark more portable

2014-12-05 Thread ksakellis
GitHub user ksakellis opened a pull request: https://github.com/apache/spark/pull/3628 [SPARK-4774] Makes HiveFromSpark more portable HiveFromSpark read the kv1.txt file from SPARK_HOME/examples/src/main/resources/kv1.txt which assumed you had a source tree checked out. Now we

[GitHub] spark pull request: [WIP] SPARK-2450 Adds exeuctor log links to We...

2014-12-01 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3486#discussion_r21134086 --- Diff: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala --- @@ -50,10 +50,16 @@ private[spark] class

[GitHub] spark pull request: [SPARK-4079] [CORE] Consolidates Errors if a C...

2014-12-01 Thread ksakellis
Github user ksakellis commented on a diff in the pull request: https://github.com/apache/spark/pull/3119#discussion_r2000 --- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala --- @@ -42,27 +43,37 @@ trait CompressionCodec { def compressedOutputStream

[GitHub] spark pull request: [WIP] SPARK-2450 Adds exeuctor log links to We...

2014-11-26 Thread ksakellis
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-64718983 ![screen shot 2014-11-26 at 2 17 12 pm](https://cloud.githubusercontent.com/assets/6590087/5209741/1829e182-7577-11e4-9c88-555b0ca7b671.png) --- If your project is

[GitHub] spark pull request: [WIP] SPARK-2450 Adds exeuctor log links to We...

2014-11-26 Thread ksakellis
GitHub user ksakellis opened a pull request: https://github.com/apache/spark/pull/3486 [WIP] SPARK-2450 Adds exeuctor log links to Web UI Creates a general way to add log links to the executor page in the web UI. This is achieved by: 1) Adding two new spark

  1   2   >