Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2366#issuecomment-55472830
What happens when there is a recomputation which results in the same blockId
getting regenerated (unpersist followed by recomputation/persist, or block drop
followed
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2366#issuecomment-56293724
@tdas handling (1) deterministically will make (2) in line with what we
currently have.
And that should be sufficient imo.
(3) was not in context
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/2366#discussion_r17833363
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -787,31 +789,88 @@ private[spark] class BlockManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/2366#discussion_r17833383
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -787,31 +789,88 @@ private[spark] class BlockManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/2366#discussion_r17833419
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -787,31 +789,88 @@ private[spark] class BlockManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/2366#discussion_r17833483
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -787,31 +789,88 @@ private[spark] class BlockManager
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2422#issuecomment-56373277
Is there an example of how this is going to be leveraged ?
The default case is the simple version delegating to existing Spark - would
be good to see how this is used
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1486#issuecomment-56480392
Are we proposing to introduce HDFS caching tags/idioms directly into
TaskSetManager in this PR ?
That does not look right. We need to generalize this so that any RDD
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1486#issuecomment-56506066
@pwendell This is not Hadoop RDD specific functionality - it is a general
requirement which can be leveraged by any RDD in Spark - and Hadoop RDD
currently happens
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/2366#discussion_r17927862
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala
---
@@ -787,31 +790,111 @@ private[spark] class BlockManager
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2366#issuecomment-56566367
@tdas In case I did not mention it before :-) this is definitely a great
improvement over what existed earlier !
I would love it if we could (sometime soon I hope
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-49804353
We saw a bunch of EOF Exceptions from SpillReader.
java.io.EOFException
at
java.io.ObjectInputStream$BlockDataInputStream.peekByte
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15259118
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,573 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15259190
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,573 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-49833579
I had pulled about 20 mins after I mailed you ...
I have elaborated on why this occurs inline in the code - we can ignore it
for now though, since it happens even
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15274240
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,649 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1541#issuecomment-49855865
Instead of a ConcurrentHashMap, we should actually move it to a disk-backed
map - the cleanup of this data structure is painful, since it can become
extremely large
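A minimal sketch of the disk-backed map idea above, using the standard library's `shelve` module; the key name is hypothetical, and the real structure in Spark would also need concurrency control, which `shelve` does not provide.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "map-state")

# Entries are written to disk instead of accumulating on the heap.
with shelve.open(path) as m:
    m["shuffle_0_0"] = [1, 2, 3]

# Reopening the shelf recovers the entries; deleting the files is the cleanup.
with shelve.open(path) as m:
    assert m["shuffle_0_0"] == [1, 2, 3]
```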
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1499#discussion_r15288486
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -0,0 +1,649 @@
+/*
+ * Licensed to the Apache Software
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-49949511
@mateiz The total memory overhead actually goes much higher than
num_streams right ?
It should be order of num_streams + num_values for this key.
For fairly
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50115353
Running tests with
export
SPARK_JAVA_OPTS=-Dspark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager
causes :
'''
- sorting using mutable
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50115453
BTW, this is one of 5 failures from core.
I hope there are no merge issues though,
---
If your project is set up for it, you can reply to this email and have your
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50116492
ah, thanks ! rerunning with 9c29957.
can't pull the PR - and manual merge is painful, hence the delays in testing :-)
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1580#issuecomment-50246267
Actually we have also seen this happen multiple times.
A few of them have been fixed, but not all have been identified.
For example, there is incorrect DCL
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1313#issuecomment-50257258
Since all process-local tasks are also node-, rack- and any-local: we will
incur the node-local delay also.
On 27-Jul-2014 11:09 am, Matei Zaharia notificati...@github.com
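A sketch of the point above (not Spark's actual scheduler code): a process-local task also satisfies the node, rack, and any locality levels, so delay-scheduling waits configured at those levels apply to it as well.

```python
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]

def satisfied_levels(task_level):
    """Levels at which a task with the given locality preference can run."""
    return LEVELS[LEVELS.index(task_level):]

# A process-local task matches every level, including NODE_LOCAL,
# which is why it also incurs the node-local delay.
assert "NODE_LOCAL" in satisfied_levels("PROCESS_LOCAL")
assert satisfied_levels("ANY") == ["ANY"]
```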
GitHub user mridulm opened a pull request:
https://github.com/apache/spark/pull/1609
[SPARK-2532] WIP Consolidated shuffle fixes
Status of the PR
- [X] Cherry pick and merge changes from internal branch to spark master
- [X] Remove WIP comments and 2G branch references
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15442779
--- Diff:
core/src/test/scala/org/apache/spark/storage/DiskBlockObjectWriterSuite.scala
---
@@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15448803
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -935,15 +941,22 @@ private[spark] object Utils extends Logging {
* Currently
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1609#issuecomment-50306648
Accidental close, apologies !
Github user mridulm closed the pull request at:
https://github.com/apache/spark/pull/1609
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1609#issuecomment-50306633
@witgo I did not understand the space issue : stylecheck seems to run fine.
Regarding the actual issues : the JIRA lists some of them - unfortunately
GitHub user mridulm reopened a pull request:
https://github.com/apache/spark/pull/1609
[SPARK-2532] WIP Consolidated shuffle fixes
Status of the PR
- [X] Cherry pick and merge changes from internal branch to spark master
- [X] Remove WIP comments and 2G branch references
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1609#issuecomment-50307155
Jenkins, test this please
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15449433
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -947,6 +958,34 @@ private[spark] object Utils extends Logging
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1609#issuecomment-50455483
All pending fix work has been done.
I dont think there are any pieces missing in the merge from internal branch
to master.
Open for review, thanks !
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1542#issuecomment-50488319
@pwendell @mateiz was this PR really merged into spark ?
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15537366
--- Diff:
core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala ---
@@ -40,7 +40,7 @@ private[spark] class JavaSerializationStream(out
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1542#issuecomment-50509704
That was super scary ! Thanks for clarifying @aarondav
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15540734
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleWriter.scala ---
@@ -116,8 +118,13 @@ class HashShuffleWriter[K, V](
private
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15540782
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleWriter.scala ---
@@ -71,7 +72,8 @@ class HashShuffleWriter[K, V](
try
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15540934
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -107,68 +109,296 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15541003
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -107,68 +109,296 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15541065
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -188,6 +425,39 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15541257
--- Diff:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala ---
@@ -236,31 +241,61 @@ object ShuffleBlockManager {
new
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15541435
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala
---
@@ -353,26 +368,53 @@ class ExternalAppendOnlyMap[K, V, C
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15542308
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala
---
@@ -418,7 +459,25 @@ class ExternalAppendOnlyMap[K, V, C
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1609#issuecomment-50517754
I have added some comments to the PR in the hopes that it will aid in the
review.
I am sure it is still an involved process in spite of this, so please do feel
free
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1499#issuecomment-50522967
@mateiz please refer to changes here :
https://github.com/apache/spark/pull/1609/files#diff-10
They should be relevant to this PR too
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15565447
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -107,68 +109,296 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15565486
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -107,68 +109,296 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1609#discussion_r15565552
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -947,6 +958,34 @@ private[spark] object Utils extends Logging
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1678#discussion_r15682389
--- Diff:
core/src/main/scala/org/apache/spark/shuffle/hash/HashShuffleWriter.scala ---
@@ -120,8 +121,7 @@ private[spark] class HashShuffleWriter[K, V
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1678#discussion_r15682412
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -147,28 +147,36 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1678#discussion_r15682457
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -147,28 +147,36 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1678#discussion_r15683205
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -147,28 +147,36 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1678#discussion_r15683224
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -147,28 +147,36 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/855#issuecomment-50895949
This definitely is much better, thanks for the PR !
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1678#discussion_r15701250
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockObjectWriter.scala ---
@@ -147,28 +147,36 @@ private[spark] class DiskBlockObjectWriter
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1486#discussion_r15725601
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -243,10 +244,23 @@ class HadoopRDD[K, V](
new
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1486#discussion_r15725610
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -216,6 +216,7 @@ abstract class RDD[T: ClassTag](
getPreferredLocations(split
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15725631
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala
---
@@ -215,16 +218,28 @@ class ExternalAppendOnlyMap[K, V, C
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15725641
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala
---
@@ -389,27 +404,51 @@ class ExternalAppendOnlyMap[K, V, C
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15725667
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala
---
@@ -389,27 +404,51 @@ class ExternalAppendOnlyMap[K, V, C
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15725700
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala
---
@@ -455,7 +495,25 @@ class ExternalAppendOnlyMap[K, V, C
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15725724
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
---
@@ -30,8 +30,19 @@ class ExternalAppendOnlyMapSuite
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15725858
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
---
@@ -30,8 +30,19 @@ class ExternalAppendOnlyMapSuite
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1525#discussion_r15725875
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -47,19 +47,19 @@ class
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1525#discussion_r15725931
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
---
@@ -47,19 +47,19 @@ class
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15728047
--- Diff:
core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
---
@@ -30,8 +30,19 @@ class ExternalAppendOnlyMapSuite
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1736#discussion_r15728056
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockManagerSource.scala ---
@@ -46,9 +46,8 @@ private[spark] class BlockManagerSource(val
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1736#discussion_r15732438
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockManagerSource.scala ---
@@ -46,9 +46,8 @@ private[spark] class BlockManagerSource(val
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1736#discussion_r15732470
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockManagerSource.scala ---
@@ -46,9 +46,8 @@ private[spark] class BlockManagerSource(val
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1722#issuecomment-50992153
LGTM, thanks Matei !
On 03-Aug-2014 12:13 pm, Matei Zaharia notificati...@github.com wrote:
@aarondav https://github.com/aarondav / @mridulm
https
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1722#issuecomment-50992283
Oh wait, is the Java serializer change also ported ?
Else the tests won't do what we want them to do.
On 03-Aug-2014 8:11 pm, Mridul Muralidharan mri...@gmail.com
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1722#issuecomment-51003282
LGTM !
Though I would prefer if @aarondav also took a look at it - since this is
based on my earlier work, I might be too close to it to see potential issues
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1722#discussion_r15750212
--- Diff:
core/src/main/scala/org/apache/spark/serializer/JavaSerializer.scala ---
@@ -35,16 +35,15 @@ private[spark] class JavaSerializationStream(out
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1722#issuecomment-51047651
LGTM !
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1780#issuecomment-51168402
IIRC if Kryo can't hold the entire serialized object in the buffer, it throws up
: we saw issues with it being as high as 256 kb for some of our jobs : though
we were using
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1781#issuecomment-51169641
We are running this with 8k or so :-)
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/1218#discussion_r15827428
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1531,18 +1532,6 @@ object SparkContext extends Logging {
throw new
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1780#issuecomment-51269524
Hi @pwendell, my observation about buffer size was not in context of spark
... we saw issues which looked like buffer overflow when the serialized
object graph was large
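A hedged config sketch for the buffer-overflow issue described above: raising Kryo's serialization buffer so large object graphs fit. The property names follow Spark 1.x conventions of the period and should be checked against the deployed version; the class and jar names are hypothetical, and 256 echoes the size mentioned in the comment rather than a recommendation.

```shell
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer.mb=256 \
  --class com.example.MyJob myjob.jar
```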
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1872#issuecomment-51710872
If case class then does it still need to be Externalizable ?
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1870#issuecomment-51850204
Just saw this as part of the close, sorry for the late comment.
Also, some of the INFO messages which are useful have now become DEBUG ?
Makes it slightly harder
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1896#issuecomment-51850469
It is just a modification of the test above it :-)
Maybe some copy-paste error ? That line is not required for this test btw -
just the last line validates the issue
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1870#issuecomment-51850828
Unfortunately, in most cases, we won't know what the issue is other than bug
hunting in the logs.
So debug logging gets enabled for a wide swathe of packages
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1896#issuecomment-51852483
It has to do with FakeRackUtil used in that class.
I guess not all tests clean it up properly after assigning to it : which is
why host2 (or host1 depending on order
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1896#issuecomment-51852593
To be deterministic, you can add FakeRackUtil.cleanup at the beginning of this
test too.
Though ideally we should do it to tests which add hosts to rack
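A sketch of the test-hygiene point above, with Python stand-ins for the Scala test (FakeRackUtil exists in the Spark test suite; everything else here is hypothetical): shared fake state must be reset at the start of a test, or its result depends on which tests ran before it.

```python
class FakeRackUtil:
    # shared mutable state, like a test-only host-to-rack resolver
    _host_to_rack = {}

    @classmethod
    def assign(cls, host, rack):
        cls._host_to_rack[host] = rack

    @classmethod
    def cleanup(cls):
        cls._host_to_rack.clear()

def test_rack_locality():
    FakeRackUtil.cleanup()  # deterministic regardless of test order
    FakeRackUtil.assign("host2", "rack2")
    assert FakeRackUtil._host_to_rack == {"host2": "rack2"}

# Without the cleanup, this host assigned by an "earlier test" would leak in.
FakeRackUtil.assign("host1", "rack1")
test_rack_locality()
```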
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2729#issuecomment-58473957
At least for YARN, this will create issues if overridden from the default.
Not sure about Mesos.
Why not use a std Java property and define it for local
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2729#issuecomment-58479810
There is a Java property which controls this ... java.io.tmpdir
On 09-Oct-2014 1:22 pm, notificati...@github.com wrote:
@mridulm https://github.com
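A hedged sketch of the suggestion above: the standard JVM property `java.io.tmpdir` can redirect temporary files without a Spark-specific setting. The paths are illustrative, the class and jar names are hypothetical, and the `extraJavaOptions` property names should be checked against the deployed Spark version.

```shell
spark-submit \
  --conf spark.driver.extraJavaOptions=-Djava.io.tmpdir=/mnt/scratch/tmp \
  --conf spark.executor.extraJavaOptions=-Djava.io.tmpdir=/mnt/scratch/tmp \
  --class com.example.MyJob myjob.jar
```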
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2742#issuecomment-58724312
This needs to be configurable ... IIRC 1.1 had this customizable.
Different limits exist for vm vs heap memory in yarn (for example).
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2742#issuecomment-58728241
With 1.1, in experiments, we have done both : depending on whether our user code
is mmap'ing too much data (and so we pull things into heap .. using
libraries not in our
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/2742#issuecomment-58728319
Note: this is required since there are heap and VM limits enforced, so we
juggle available memory around so that jobs can run to completion!
On 11-Oct-2014 4:56 am
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/892#discussion_r13596860
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -153,8 +153,8 @@ private[spark] class TaskSetManager
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/892#discussion_r13597675
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -388,7 +386,7 @@ private[spark] class TaskSetManager(
val
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/892#discussion_r13598180
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -388,7 +386,7 @@ private[spark] class TaskSetManager(
val
Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/892#discussion_r13601836
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -388,7 +386,7 @@ private[spark] class TaskSetManager(
val
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/892#issuecomment-45732793
Just wanted to drop a quick note (since I might not be able to get to this
until late next week).
I think the proposal should work : though I might be missing
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/900#issuecomment-45779752
This one slipped off my radar, my apologies.
@tgravescs In #892, if there is even a single executor which is process
local with any partition, then we start waiting
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/900#issuecomment-45780405
Hit submit by mistake, to continue ...
The side effect of not having sufficient executors are different from #892.
For example,
a) the default parallelism in yarn
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1091#issuecomment-46125063
Use spark.rdd.compress = true for compressing serialized RDD.
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1091#issuecomment-46481279
Misread the PR and confused it with another pull request, ignore my earlier
comment.