Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1113#issuecomment-55746200
@andrewor14 I updated it per your comments. Thanks.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1113#issuecomment-55842735
@vanzin @andrewor14 I think a path given to spark-submit --jars must be local
to the slave node; if the path is on HDFS, it cannot be downloaded from HDFS to
a local path on the slave.
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/1528
Use the config spark.scheduler.priority to specify a TaskSet's priority in
DAGScheduler
https://issues.apache.org/jira/browse/SPARK-2618
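For context, a minimal sketch of how the proposed config could be set from an
application. The config name comes from this PR and was never merged into
Spark, so treat it as hypothetical:

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical usage of the proposed, unmerged config from this PR.
    val conf = new SparkConf()
      .setAppName("priority-demo")
      .setMaster("local[2]")
      .set("spark.scheduler.priority", "10") // higher value would run first under FIFO
    val sc = new SparkContext(conf)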
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49756393
This adds a user-defined priority to FIFO. If the user does not configure a
priority, it works as before. It is non-preemptive: when there are free
executors, we can let higher-priority task sets run first.
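A minimal sketch of the ordering described above, with assumed field names
(the actual patch changes Spark's FIFO scheduling internals):

    // Non-preemptive FIFO with a user priority: higher priority first,
    // ties broken by submission order. Field names are illustrative.
    case class TaskSetInfo(userPriority: Int, jobId: Int)

    val fifoWithPriority: Ordering[TaskSetInfo] =
      Ordering.by((ts: TaskSetInfo) => (-ts.userPriority, ts.jobId))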
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49830100
@markhamstra @CodingCat thank you for the comments. I updated the patch;
please review again.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49833109
@markhamstra thank you. I updated the patch. Any more comments?
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49846832
@markhamstra thank you. How about the latest code?
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49849632
@markhamstra @pwendell I have updated SPARK-2618; please take a look.
Thanks.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1384#issuecomment-49860984
I think you can add the job ID to the stage table, because the job ID is very
useful when an application has many jobs; it distinguishes each job's stages.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49862687
I do not think priority is useful for the FAIR scheduler. In YARN, priority
works with FIFO scheduling, not with FAIR, so I think a Spark application's
scheduler mode…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1528#issuecomment-49864527
Maybe I misunderstood you. With FAIR, Schedulable.weight can replace priority.
You mean that with FAIR we could expose a weight config to the user? For
example: spark.scheduler.weight
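For reference, FAIR mode already expresses relative importance through pool
weights defined in fairscheduler.xml; a job opts into a pool per thread via a
real SparkContext API. The pool name below is an assumption:

    // Assumes sc is an active SparkContext and "highWeightPool" is a pool
    // whose weight is configured in fairscheduler.xml.
    sc.setLocalProperty("spark.scheduler.pool", "highWeightPool")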
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1545#issuecomment-49869232
I think we can add the job ID to the stage table, because the job ID is very
useful when an application has many jobs; it distinguishes each job's stages.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1180#issuecomment-49883664
I think for a long-running application, maxNumExecutorFailures is sometimes
hit for YARN-side reasons, even though YARN quickly provides Spark with a new
container. Although…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1180#issuecomment-49886521
@tgravescs I think if YARN will give the application more executors, the
application will keep working and does not need maxNumExecutorFailures. I
think…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1180#issuecomment-49890478
Thank you, but I think some errors, like disk failures or machines going down,
cannot be covered by that, so maybe we need to exclude some errors that YARN
reports. What do you think?
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1180#issuecomment-49892019
I see. I am OK with this PR. Thanks.
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/1549
Shuffle blocksByAddress to avoid many reducers fetching data from one
executor at a time
Like MapReduce, we should shuffle blocksByAddress; it avoids many reducers
connecting to one executor…
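A minimal sketch of the idea, assuming blocksByAddress is a sequence of
per-executor block groups (types simplified from the real BlockFetcherIterator):

    import scala.util.Random

    // Randomize the order in which per-executor block groups are fetched,
    // so reducers do not all hit the same executor first.
    def shuffleByAddress[A, B](blocksByAddress: Seq[(A, Seq[B])]): Seq[(A, Seq[B])] =
      Random.shuffle(blocksByAddress)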
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1572#issuecomment-50126087
OK, I understand your idea. The current implementation lets the remaining
tasks run, but that has a problem if one of the remaining tasks is writing to
HDFS while other new…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1113#issuecomment-50126195
@andrewor14 can you take a look at this? Thanks.
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/1589
[SPARK-2687] [yarn] amClient should remove ContainerRequest
Per https://issues.apache.org/jira/browse/YARN-1902, after receiving
allocated containers, if amClient does not remove the ContainerRequest, the RM…
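A minimal Scala sketch of the fix described, using the AMRMClient API; the
helper and the ANY-host matching are simplifications, not the PR's code:

    import org.apache.hadoop.yarn.api.records.{Container, ResourceRequest}
    import org.apache.hadoop.yarn.client.api.AMRMClient
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
    import scala.collection.JavaConverters._

    // After the RM hands containers back, remove one matching outstanding
    // request per container so the AM's ask table is not left inflated,
    // which would make the RM keep allocating against stale requests.
    def removeMatchingRequests(
        amClient: AMRMClient[ContainerRequest],
        allocated: Seq[Container]): Unit = {
      allocated.foreach { container =>
        val matches = amClient.getMatchingRequests(
          container.getPriority, ResourceRequest.ANY, container.getResource)
        matches.asScala.flatMap(_.asScala).headOption
          .foreach(amClient.removeContainerRequest)
      }
    }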
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1579#issuecomment-50146398
I want to know how much memory @shivaram mentioned before. @concretevitamin,
can you show the number? Thanks.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1384#issuecomment-50149066
@tsudukim yes, SPARK-2298 is what I want. But I think a simpler way is to add
a job ID column to the stage table in this PR; that is very easy to achieve.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1549#issuecomment-50293186
Good question. I understand better now and think this PR is unnecessary.
Thank you.
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/1549
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1589#issuecomment-50475830
@witgo @andrewor14 please take a look at it. Thanks.
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/1632#discussion_r15573389
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala ---
@@ -117,31 +121,45 @@ object BlockFetcherIterator
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/1113
Add the ability to submit multiple jars for the Driver
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/1114
Discard excess completedDrivers
When the number of completedDrivers exceeds the threshold, the first
max(spark.deploy.retainedDrivers, 1) entries will be discarded.
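A minimal sketch of the trimming described; the batch size mirrors Spark's
usual retained-* cleanup and is an assumption, since the snippet above is
truncated:

    import scala.collection.mutable.ArrayBuffer

    // Drop the oldest completed drivers once the retained threshold is hit.
    // The divide-by-10 batch size is assumed, not taken from this PR's text.
    def trimCompletedDrivers(completedDrivers: ArrayBuffer[String], retained: Int): Unit = {
      if (completedDrivers.size >= retained) {
        completedDrivers.trimStart(math.max(retained / 10, 1))
      }
    }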
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1114#issuecomment-48995170
@andrewor14 I have created a JIRA issue, SPARK-2302. Yes, it is for reducing
the Master's memory. Thank you.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1114#issuecomment-49173609
@CodingCat thanks, I have created a JIRA issue:
https://issues.apache.org/jira/browse/SPARK-2524
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/1443
[SPARK-2524] Missing documentation for spark.deploy.retainedDrivers
https://issues.apache.org/jira/browse/SPARK-2524
The spark.deploy.retainedDrivers configuration is undocumented.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1443#issuecomment-49249536
@pwendell thanks. I addressed your comments.
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/1385#discussion_r15055059
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -128,25 +123,13 @@ class HadoopRDD[K, V](
// Returns a JobConf
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/997#discussion_r15095873
--- Diff: bin/compute-classpath.sh ---
@@ -30,6 +30,11 @@ FWDIR=$(cd `dirname $0`/..; pwd)
# Build up classpath
CLASSPATH=$SPARK_CLASSPATH
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/1309#discussion_r15113632
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -217,6 +223,7 @@ private[ui] class StagePage(parent: JobProgressTab
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1437#issuecomment-49439644
But when the broadcast's size reaches 1 GB, TorrentBroadcast hits an "array
size exceeds" error: Utils.serialize() turns the object into a single
Array[Byte], and when the broadcast object's…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1253#issuecomment-49542251
How about -Dspark.app.name=blah? The JVM and Hadoop both use the -D flag to
express configuration properties.
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/634#discussion_r15156184
--- Diff:
yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientClusterScheduler.scala
---
@@ -37,14 +37,4 @@ private[spark] class
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/634#discussion_r15156221
--- Diff:
yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
---
@@ -30,6 +30,11 @@ private[spark] class
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1113#issuecomment-54457835
@JoshRosen @andrewor14 I have updated the comment. Any questions about it?
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/792#issuecomment-55122492
@ankurdave yes, I agree with you. Like this:
def toEdgePartition(sort: Boolean = true): EdgePartition[ED, VD] = {
}
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/3061
[SPARK-4195][Core] Retry fetching a block's result when the fetch failure's
reason is a connection timeout
When there are many executors in an application (for example, 1000),
connection timeouts often occur…
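A generic sketch of the retry idea; the retry count and treating IOException
as the retryable failure are assumptions, not the PR's code:

    import java.io.IOException

    // Retry a fetch that may fail with a connection timeout, up to
    // maxRetries extra attempts before letting the exception propagate.
    def fetchWithRetry[T](maxRetries: Int)(fetch: => T): T = {
      var attempt = 0
      var result: Option[T] = None
      while (result.isEmpty) {
        try {
          result = Some(fetch)
        } catch {
          case e: IOException if attempt < maxRetries =>
            attempt += 1 // a real implementation would log and back off here
        }
      }
      result.get
    }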
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/3138
[SPARK-4249][GraphX] Fix a problem in GraphX's EdgePartitionBuilder
At first, srcIds is not initialized and its entries are all 0, so we use
edgeArray(0).srcId as currSrcId…
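A simplified sketch of the fix: seed the running source ID from the first
sorted edge rather than from the not-yet-filled srcIds array. Types and the
index structure are simplified from GraphX's EdgePartitionBuilder:

    case class SimpleEdge(srcId: Long, dstId: Long)

    // Build a srcId -> first-position index over edges sorted by srcId.
    def buildIndex(edgeArray: Array[SimpleEdge]): Map[Long, Int] = {
      val index = scala.collection.mutable.Map[Long, Int]()
      if (edgeArray.nonEmpty) {
        // The fix: use edgeArray(0).srcId, not srcIds(0), which is still all zeros.
        var currSrcId = edgeArray(0).srcId
        index(currSrcId) = 0
        var i = 1
        while (i < edgeArray.length) {
          if (edgeArray(i).srcId != currSrcId) {
            currSrcId = edgeArray(i).srcId
            index(currSrcId) = i
          }
          i += 1
        }
      }
      index.toMap
    }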
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1589#issuecomment-62085093
@tgravescs I have taken a look at the latest version and confirmed the
problem still exists: when amClient receives containers from YARN's RM,
amClient needs…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1589#issuecomment-62489097
Yes, the scenario you described is one situation. Another is:
Spark requests 2 containers (YARN request total = 2);
YARN allocates 2 and gives…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1589#issuecomment-62720066
I think the RM can allocate more than one container to Spark's AM when an
executor fails. Here is a scenario:
1. Spark requests 3 containers (AM RM request total = 3)
2. …
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/3245
[SPARK-2687] [yarn] amClient should remove ContainerRequest
Per https://issues.apache.org/jira/browse/YARN-1902, after receiving
allocated containers, if amClient does not remove the ContainerRequest, the RM…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1589#issuecomment-62895644
Yes, I created a new PR for the latest code:
https://github.com/apache/spark/pull/3245
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/1589
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3245#discussion_r20504747
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala
---
@@ -43,10 +44,20 @@ private[yarn] class
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3245#discussion_r20559836
--- Diff:
yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala
---
@@ -43,10 +44,20 @@ private[yarn] class
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/3403
[SPARK-4534][Core] JavaSparkContext: create a new constructor to support
preferredNodeLocalityData with YARN
Create a new constructor to support preferredNodeLocalityData with YARN.
Example…
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/3430
[SPARK-2387][Core] Remove the stage barrier
Based on https://github.com/apache/spark/pull/1328.
When one task of the parent stage is not finished, the other executors are
idle; we can pre-start…
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3409#issuecomment-64192799
I think spark.yarn.driver.extraJavaOptions or
spark.yarn.executor.extraJavaOptions is better.
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/3403
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3403#issuecomment-64429299
OK, I will close this. Thanks.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3465#issuecomment-64516973
Thanks Aaron, that's great.
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12268618
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
---
@@ -523,6 +504,81 @@ private[spark] class Master
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12310911
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
---
@@ -523,6 +504,89 @@ private[spark] class Master
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12310983
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
---
@@ -523,6 +504,89 @@ private[spark] class Master
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12507640
--- Diff:
core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -28,6 +28,7 @@ private[spark] class ApplicationDescription
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12509207
--- Diff:
core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -28,6 +28,7 @@ private[spark] class ApplicationDescription
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/731#discussion_r12564010
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala
---
@@ -466,30 +466,14 @@ private[spark] class Master(
* launched
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12472994
--- Diff:
core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -28,6 +28,7 @@ private[spark] class ApplicationDescription
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/636#discussion_r12464105
--- Diff:
core/src/main/scala/org/apache/spark/deploy/ApplicationDescription.scala ---
@@ -28,6 +28,7 @@ private[spark] class ApplicationDescription
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/827#issuecomment-44000844
Great work, @zhpengg. This problem appears in our environment and we fixed
it, so we should quickly merge this into the 1.0 release.
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/864
Bugfix: the Worker's DriverStateChanged handling should match
DriverState.FAILED
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/864#issuecomment-44158350
Can anyone verify this patch? Thanks.
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/1113
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/1113#issuecomment-67427901
Yes, I think we can close this PR.
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3061#discussion_r22245434
--- Diff:
core/src/main/scala/org/apache/spark/network/nio/NioBlockTransferService.scala
---
@@ -39,6 +41,10 @@ final class NioBlockTransferService(conf
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3061#discussion_r22245488
--- Diff:
core/src/main/scala/org/apache/spark/network/nio/NioBlockTransferService.scala
---
@@ -121,8 +132,34 @@ final class NioBlockTransferService
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/3245
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3245#issuecomment-68059290
OK, I will close this PR.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3061#issuecomment-68059363
OK. I will close this PR.
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/3061
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3797#issuecomment-68103677
@XuTingjun yes, I agree with you. We should run parseArgs before using the
amMemory and executorMemory configs, because parseArgs can change those
values from args…
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/3828
[SPARK-4994][network] Clean up removed executors' shuffle info in the YARN
shuffle service
When the application completes, YARN's NodeManager can remove the
application's local dirs, but all executors…
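A minimal sketch of the cleanup described; the map layout and the
"appId;execId" key format are assumptions, not the shuffle service's actual
structures:

    import java.util.concurrent.ConcurrentHashMap

    // When an application finishes, drop the shuffle info of all its executors
    // so the long-running shuffle service does not leak entries.
    val executorShuffleInfo = new ConcurrentHashMap[String, String]()

    def applicationRemoved(appId: String): Unit = {
      val it = executorShuffleInfo.keySet().iterator()
      while (it.hasNext) {
        if (it.next().startsWith(appId + ";")) it.remove()
      }
    }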
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4258#issuecomment-72780420
Jenkins, retest this please.
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4369
[SPARK-5593][Core]Replace BlockManagerListener with ExecutorListener in
ExecutorAllocationListener
More strictly, in ExecutorAllocationListener, we need to replace
onBlockManagerAdded
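A sketch of the swap: track executor lifetimes from executor events instead
of block-manager events. The listener callbacks are real SparkListener API;
this standalone class is illustrative, not the PR's code:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded, SparkListenerExecutorRemoved}
    import scala.collection.mutable

    class ExecutorTracker extends SparkListener {
      private val executorIds = mutable.Set[String]()

      override def onExecutorAdded(event: SparkListenerExecutorAdded): Unit =
        executorIds += event.executorId

      override def onExecutorRemoved(event: SparkListenerExecutorRemoved): Unit =
        executorIds -= event.executorId
    }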
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4309#issuecomment-72455127
LGTM
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-72578648
@andrewor14 thank you. About SPARK_HOME, we need to consider compatibility in
two places. The first is communication between the Python context and the
Scala context…
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3903#discussion_r23982799
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -52,11 +52,7 @@ class BlockManagerMasterActor(val isLocal
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4409#issuecomment-73188040
LGTM
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4367
[SPARK-5529][Core] Replace blockManager's timeout checking with the
executor's timeout checking
The phenomenon is: blockManagerSlave times out and BlockManagerMasterActor
will remove…
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4430
[SPARK-5653][YARN] In ApplicationMaster, rename isDriver to isClusterMode
In ApplicationMaster, rename isDriver to isClusterMode, because Client
already uses isClusterMode and ApplicationMaster should…
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4141#discussion_r24241097
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -124,10 +123,12 @@ private[yarn] class YarnAllocator
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4066#discussion_r24240721
--- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala ---
@@ -148,6 +148,20 @@ case object TaskKilled extends TaskFailedReason
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4066#discussion_r24315166
--- Diff: core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala ---
@@ -105,24 +106,61 @@ class SparkHadoopWriter(@transient jobConf: JobConf
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3765#discussion_r22984790
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -153,498 +154,241 @@ private[yarn] class YarnAllocator
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3765#discussion_r22984854
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -153,498 +154,241 @@ private[yarn] class YarnAllocator
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4070
[SPARK-4630][Core] Dynamically determine the optimal number of partitions
Stages in an application have different data sizes. If the user does not set
numPartitions for a stage, Spark will use the same…
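A toy sketch of the idea: derive a partition count from a stage's input size
instead of one app-wide default. The 128 MB target per partition is an
assumed heuristic:

    // Pick roughly one partition per bytesPerPartition of input.
    def optimalPartitions(inputBytes: Long, bytesPerPartition: Long = 128L * 1024 * 1024): Int =
      math.max(1, math.ceil(inputBytes.toDouble / bytesPerPartition).toInt)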
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4051#issuecomment-70606552
When we set the initial number of executors to the minimum, there is a delay
before enough executors are requested, because ExecutorAllocationManager's
addExecutors starts numExecutorsToAdd from 1…
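To illustrate the ramp-up being described: requests double each round, so
starting from the minimum takes several scheduling rounds to reach a large
target:

    // Executors requested per round under exponential ramp-up.
    val requestsPerRound = Iterator.iterate(1)(_ * 2).take(5).toList // List(1, 2, 4, 8, 16)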
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4070#issuecomment-70489709
@rxin yes, some ETL jobs with groupBy and join operators have tried this
feature; most of the time it determines the number of partitions very well.
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4061
[SPARK-5266][YARN] The AM's numExecutorsFailed should exclude executors
killed on request
When the driver requests killExecutor, the AM kills the container and
numExecutorsFailed increments. When…
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/4061
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4061#issuecomment-70089056
I found that when the driver requests killExecutor, numExecutorsFailed does
not increment, so I will close this PR.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-71034833
Yes, this PR is for batch mode via a .py file, so I think yarn-client mode is
enough for interactive /bin/pyspark. But sometimes a batch Python application
needs to run…
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4134#discussion_r23381894
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -349,34 +349,7 @@ class DAGScheduler(
}
private
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-70770431
Yes, we just specify a .py file or primaryResource file via spark-submit;
with this PR we can make PySpark run in yarn-cluster mode.
Example:
spark-submit…