Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/12142#issuecomment-205149044
Just that minor question, but otherwise LGTM
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/12142#discussion_r58330267
--- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -249,6 +250,8 @@ trait SparkListener {
* Called when other
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/11672#issuecomment-195850156
I would also like to see the Kafka modules removed in a similar way. We
have had trouble balancing Spark's compatibility requirements and Kafka's every
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/8282#issuecomment-132320867
Classic copypasta. LGTM.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/5018#issuecomment-81949164
I'm not very familiar with the code but it looks like a pretty sane change
from what I can tell.
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/5018#discussion_r26529056
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
@@ -340,7 +341,11 @@ class
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4909#discussion_r25964901
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -467,7 +467,9 @@ private[spark] class Master(
* two executors on
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3486#discussion_r25843057
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -362,11 +362,13 @@ private[spark] class Worker(
self
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4749#issuecomment-75872068
This LGTM.
Curious, have you thought about how to properly capture this metric? I
haven't given this any serious thought, but my first idea is that we need to
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4599#issuecomment-74352451
OK, this LGTM
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4599#discussion_r24707286
--- Diff: core/src/test/scala/org/apache/spark/input/WholeTextFileRecordReaderSuite.scala ---
@@ -42,7 +42,15 @@ class WholeTextFileRecordReaderSuite
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4578#issuecomment-74169618
I was meaning to do this at some point. Thanks, Josh. LGTM!
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4559#issuecomment-74130013
LGTM, @sryza. I think it's okay to include the reading time, since you can
argue this whole operation is for writing out the file - even if there is some
reading
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4559#discussion_r24608462
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -740,6 +741,8 @@ private[spark] class ExternalSorter[K, V, C
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4510#issuecomment-73812744
Thanks for doing this! That's one less thing for me to do. Overall this LGTM,
aside from the small comments I made.
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4510#discussion_r24464381
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -344,6 +346,20 @@ class ShuffleReadMetrics extends Serializable
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4510#discussion_r24464185
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -181,7 +188,8 @@ private[ui] class StagePage(parent: StagesTab) extends
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4510#discussion_r24450007
--- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ---
@@ -189,6 +189,7 @@ final class ShuffleBlockFetcherIterator
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-73429206
Yikes, @JoshRosen, I'm looking into this.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4150#issuecomment-73275084
I agree with @srowen and @sryza. Also, given #4067, this metric should really
just report size
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4430#issuecomment-73273181
LGTM
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4409#issuecomment-73187911
One small comment, but other than that LGTM
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4409#discussion_r24224468
--- Diff: core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -78,7 +78,7 @@ private[spark] class ExecutorAllocationManager
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r24223601
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -358,5 +374,12 @@ class ShuffleWriteMetrics extends Serializable
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r24223586
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -193,12 +194,11 @@ private[spark] class DiskBlockObjectWriter
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-73182058
Jenkins, retest this please
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-73164133
New screenshots with the irrelevant columns invisible:
![screen shot 2015-02-05 at 5 07 23
pm](https://cloud.githubusercontent.com/assets/6590087/6072770
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4411#issuecomment-73162789
LGTM
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r24207220
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -472,12 +512,12 @@ private[ui] class StagePage(parent: StagesTab) extends
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r24206958
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -193,12 +194,11 @@ private[spark] class DiskBlockObjectWriter
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r24205171
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala ---
@@ -1089,17 +1091,15 @@ class PairRDDFunctions[K, V](self: RDD[(K, V
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3486#discussion_r24131025
--- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala ---
@@ -382,7 +382,8 @@ private[spark] object JsonProtocol {
def
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3486#issuecomment-72946588
@JoshRosen Thanks for doing that. I merged the PR.
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23953114
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -472,12 +512,12 @@ private[ui] class StagePage(parent: StagesTab) extends
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23952709
--- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
@@ -47,9 +49,13 @@ private[spark] class CacheManager(blockManager: BlockManager
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23950220
--- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
@@ -47,9 +49,13 @@ private[spark] class CacheManager(blockManager: BlockManager
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-72498987
@pwendell I'm not sure how we can do what you propose without an O(n) loop
through all the records before passing the InterruptibleIterator. We
cou
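The lazy alternative discussed here is to wrap the result iterator so the count accumulates as records are consumed, with no separate counting pass. A minimal sketch, with hypothetical names rather than the PR's actual classes:
```scala
// Wrap an iterator and bump a counter on every next() call. The count is
// built up as the consumer pulls records, so no O(n) pre-pass is needed.
class CountingIterator[T](delegate: Iterator[T], onRecord: Long => Unit)
    extends Iterator[T] {
  private var count = 0L
  override def hasNext: Boolean = delegate.hasNext
  override def next(): T = {
    val record = delegate.next()
    count += 1
    onRecord(count) // report the running total to a metrics sink
    record
  }
}
```
Consuming the wrapped iterator drives the count, so nothing is buffered or scanned twice.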
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23940117
--- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
@@ -47,9 +49,13 @@ private[spark] class CacheManager(blockManager: BlockManager
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-72292815
@pwendell can you please re-review this? I'd like to get it into 1.3. Some
of our customers have been asking for metrics to help them determine data skew.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-72292701
New screenshot corresponding to the CR feedback:
![screen shot 2015-01-30 at 7 07 48
pm](https://cloud.githubusercontent.com/assets/6590087/5985688/871415e4-a8b3
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3486#issuecomment-71739273
@andrewor14 @JoshRosen Ping. Can you review this? I'd like to get it
into Spark 1.3.
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4192#discussion_r23629409
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -60,6 +62,9 @@ private[yarn] class YarnAllocator(
import
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4051#issuecomment-71558537
LGTM
I agree with @sryza that maxExecutors should default to Integer.MAX_VALUE.
It is not the app's responsibility to play fair but rather to ask the res
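In configuration terms, that position looks roughly like this. The keys are the dynamic-allocation settings of the period; treating maxExecutors as unbounded by default is the proposal under discussion, not shipped behavior:
```scala
import org.apache.spark.SparkConf

// Sketch: enable dynamic allocation with a floor but no explicit ceiling,
// leaving it to the resource manager to arbitrate fairness between apps.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")
// spark.dynamicAllocation.maxExecutors deliberately left unset; under the
// proposed default that would mean Integer.MAX_VALUE
```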
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4051#discussion_r23571627
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -75,14 +75,23 @@ private[spark] class ClientArguments(args: Array
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4051#discussion_r23571472
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -75,14 +75,23 @@ private[spark] class ClientArguments(args: Array
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4192#discussion_r23569025
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -60,6 +62,9 @@ private[yarn] class YarnAllocator(
import
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4082#issuecomment-71282577
LGTM - nice addition.
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4111#discussion_r23334796
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -379,6 +379,41 @@ class SparkContext(config: SparkConf) extends Logging with
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4111#discussion_r2878
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -379,6 +379,41 @@ class SparkContext(config: SparkConf) extends Logging with
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4111#discussion_r2304
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -379,6 +379,41 @@ class SparkContext(config: SparkConf) extends Logging with
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4020#issuecomment-70740442
@pwendell Yeah, looks good. Sorry, I was also away.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4111#issuecomment-70737041
@pwendell I think using configuration to add Spark listeners adds
complexity to the system.
1) Can't do static checking of listeners at compile time
2
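For contrast, the programmatic path looks roughly like this; a sketch against the SparkListener API of the period, with a hypothetical listener class:
```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

// A listener registered in code is type-checked by the compiler, unlike a
// class name passed through a configuration string.
class JobEndLogger extends SparkListener {
  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    println(s"Job ${jobEnd.jobId} ended")
}

object ListenerDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("listener-demo").setMaster("local[*]"))
    sc.addSparkListener(new JobEndLogger) // checked at compile time
    sc.parallelize(1 to 10).count()
    sc.stop()
  }
}
```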
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-70301522
A big motivation to add recordsRead/Written was to detect data skew. In
these cases bytes and records might not track very closely.
Thinking more about this, I
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4051#issuecomment-70297739
I'd prefer we don't introduce yet another config option for dynamic
scaling. How much delay are we really introducing here? The number of running
executors w
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4074#issuecomment-70296530
LGTM. What is the use case? Is this part of a bigger PR?
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-70230969
If we do that, we wouldn't be able to sort on record count and bytes
independently.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3486#issuecomment-70221523
@JoshRosen Now that #3696 and then #3711 have been merged, I rebased and
fixed the merge issues, so I think this PR is ready to be reviewed once again.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-70220867
![screen shot 2015-01-16 at 12 04 04
am](https://cloud.githubusercontent.com/assets/6590087/5773199/bcd70ffc-9d13-11e4-86c1-7140a73f9e92.png)
Shows a stage that has
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-70219531
So is this the code you were referring to in HadoopRDD?
```scala
// Find a function that will return the FileSystem bytes read by this
// thread. Do this before
```
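What that snippet goes on to do, roughly, is snapshot the per-thread Hadoop FileSystem statistics before the RecordReader is created and later report the delta as bytes read. A sketch of the idea, assuming the Hadoop 2.5+ thread-statistics API (real code would need a fallback for older Hadoop versions):
```scala
import org.apache.hadoop.fs.FileSystem
import scala.collection.JavaConverters._

// Build a callback that returns FileSystem bytes read by the current thread
// for one scheme (e.g. "hdfs"), measured relative to a baseline taken now.
def makeBytesReadCallback(scheme: String): () => Long = {
  def threadBytesRead: Long =
    FileSystem.getAllStatistics.asScala
      .filter(_.getScheme == scheme)
      .map(_.getThreadStatistics.getBytesRead)
      .sum
  val baseline = threadBytesRead // taken before the RecordReader reads anything
  () => threadBytesRead - baseline
}
```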
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-70218640
@rxin I updated the PR after doing a rebase and also incorporated some of
your feedback. You made two general comments:
1) specific implementations might've g
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23066295
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -213,18 +213,19 @@ class HadoopRDD[K, V](
logInfo("Input
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r23066065
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -146,6 +185,10 @@ class TaskMetrics extends Serializable
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4067#issuecomment-70215626
This change depended on https://github.com/apache/spark/pull/3120,
which just got merged, and now there are some merge conflicts. I need to fix
those first and will
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23065399
--- Diff: core/src/main/scala/org/apache/spark/util/interceptingIterator.scala ---
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4067#discussion_r23065373
--- Diff: core/src/main/scala/org/apache/spark/shuffle/hash/BlockStoreShuffleFetcher.scala ---
@@ -82,7 +82,16 @@ private[hash] object
GitHub user ksakellis opened a pull request:
https://github.com/apache/spark/pull/4067
[SPARK-4874] [CORE] Collect record count metrics
Collects record counts for both Input/Output and Shuffle Metrics. For the
input/output metrics, it just appends the counter every time the
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4020#discussion_r23058543
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4020#discussion_r23056809
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -240,10 +284,18 @@ class ShuffleWriteMetrics extends Serializable
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4050#discussion_r22985033
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -219,6 +220,9 @@ class HadoopRDD[K, V](
val bytesReadCallback = if
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4051#discussion_r22972821
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -73,12 +73,12 @@ private[spark] class ClientArguments(args: Array
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4051#discussion_r22972319
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala ---
@@ -73,12 +73,12 @@ private[spark] class ClientArguments(args: Array
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/4050#issuecomment-69996188
This mostly LGTM. My only concern is the proliferation of copypasta
between HadoopRDD and NewHadoopRDD.
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/4050#discussion_r22971793
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -219,6 +220,9 @@ class HadoopRDD[K, V](
val bytesReadCallback = if
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3711#issuecomment-69804537
@pwendell @nitin2goyal I had previously thought of adding it to the
DAGScheduler - this would avoid the duplicated code between the
CoarseGrainedSchedulerBackend and
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r22822258
--- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
@@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3120#discussion_r22807972
--- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala ---
@@ -153,34 +157,19 @@ class NewHadoopRDD[K, V](
throw new
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3711#discussion_r22681441
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
@@ -344,12 +354,20 @@ private[spark] class
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3711#discussion_r22680915
--- Diff: core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala ---
@@ -160,7 +160,7 @@ class EventLoggingListenerSuite extends
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3711#discussion_r22624775
--- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -84,6 +85,14 @@ case class SparkListenerBlockManagerRemoved(time: Long
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3696#issuecomment-68813105
@JoshRosen Yup, moved lazy val env = prepareEnvironment to be the last
statement so it is the same as before.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3711#issuecomment-68788875
Yes it is. Not sure why I changed the # of cores between the two commits in
the unit test - weird. Anyway, it has been fixed.
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3120#issuecomment-68778832
@kayousterhout @pwendell ping?
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3711#issuecomment-68778757
@JoshRosen I actually couldn't make ExecutorInfo (now ExecutorDetails) a
trait because the JSON protocol stuff requires a concrete object to deserialize
events
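The constraint being described: deserialization has to construct an instance, and a trait has no constructor to call. A hypothetical shape, not the actual JsonProtocol code:
```scala
// A concrete class can be instantiated while reading JSON back into events;
// if ExecutorDetails were a trait, the `new` below would not compile.
class ExecutorDetails(val host: String, val totalCores: Int)

def executorDetailsFromJson(fields: Map[String, Any]): ExecutorDetails =
  new ExecutorDetails(
    fields("Host").asInstanceOf[String],
    fields("Total Cores").asInstanceOf[Int])
```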
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3711#discussion_r22366788
--- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -84,6 +86,14 @@ case class SparkListenerBlockManagerRemoved(time: Long
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3711#discussion_r22366699
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorInfo.scala ---
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3711#discussion_r22195578
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/ExecutorInfo.scala ---
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3120#issuecomment-67213648
@kayousterhout The test you pointed out:
sc.parallelize(1 to 2).saveAsTextFile("file:tester1")
val a = sc.textFile("file:tester1")
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3119#discussion_r21922602
--- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala ---
@@ -120,6 +129,12 @@ class LZFCompressionCodec(conf: SparkConf) extends
GitHub user ksakellis opened a pull request:
https://github.com/apache/spark/pull/3711
[SPARK-4857] [CORE] Adds Executor membership events to SparkListener
Adds onExecutorAdded and onExecutorRemoved events to the SparkListener.
This will allow a client to get notified when an
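A sketch of a client of the new events, written against the signatures as they eventually landed in the SparkListener API; the WIP version may have differed:
```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded, SparkListenerExecutorRemoved}
import scala.collection.mutable

// Track live executors as membership events arrive. The listener bus
// delivers events on a single thread, so a plain mutable set suffices.
class ExecutorTracker extends SparkListener {
  private val live = mutable.Set.empty[String]
  override def onExecutorAdded(event: SparkListenerExecutorAdded): Unit = {
    live += event.executorId
    println(s"executor ${event.executorId} added; ${live.size} live")
  }
  override def onExecutorRemoved(event: SparkListenerExecutorRemoved): Unit = {
    live -= event.executorId
    println(s"executor ${event.executorId} removed; ${live.size} live")
  }
}
```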
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3696#issuecomment-66958311
Hmm, the tests failed, but I'm not sure they are related to this change. Am I
missing something?
GitHub user ksakellis opened a pull request:
https://github.com/apache/spark/pull/3696
SPARK-4843 [YARN] Squash ExecutorRunnableUtil and ExecutorRunnable
ExecutorRunnableUtil is a parent of ExecutorRunnable because of the
yarn-alpha and yarn-stable split. Now that yarn-alpha is
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3614#issuecomment-66696814
Just curious here, but is the eventual goal to pull this
functionality (adding/removing executors) out of the SparkContext object?
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3486#issuecomment-66376138
@andrewor14 The reason for this approach is that it is really not
YARN-specific. I was envisioning adding support for standalone mode this way too.
This is why the
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3622#issuecomment-66360558
@kayousterhout I ran into this issue with a private case class too, on
https://github.com/apache/spark/pull/3486, and added exclusions. I can try
adding private[spark
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3486#issuecomment-66356776
@andrewor14 Yes, you can start doing a review. I wanted to post this just to
get sign-off on the strategy I'm using. If people feel good about this, I'll
post
GitHub user ksakellis opened a pull request:
https://github.com/apache/spark/pull/3628
[SPARK-4774] Makes HiveFromSpark more portable
HiveFromSpark read the kv1.txt file from
SPARK_HOME/examples/src/main/resources/kv1.txt, which assumed
you had a source tree checked out. Now we
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3486#discussion_r21134086
--- Diff: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala ---
@@ -50,10 +50,16 @@ private[spark] class
Github user ksakellis commented on a diff in the pull request:
https://github.com/apache/spark/pull/3119#discussion_r2000
--- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala ---
@@ -42,27 +43,37 @@ trait CompressionCodec {
def compressedOutputStream
Github user ksakellis commented on the pull request:
https://github.com/apache/spark/pull/3486#issuecomment-64718983
![screen shot 2014-11-26 at 2 17 12
pm](https://cloud.githubusercontent.com/assets/6590087/5209741/1829e182-7577-11e4-9c88-555b0ca7b671.png)
GitHub user ksakellis opened a pull request:
https://github.com/apache/spark/pull/3486
[WIP] SPARK-2450 Adds executor log links to Web UI
Creates a general way to add log links to the executor
page in the web UI.
This is achieved by:
1) Adding two new spark